Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloudv2: Higher resolution for Histograms #3302

Merged
merged 3 commits into from
Aug 31, 2023
Merged

Conversation

codebien
Copy link
Contributor

@codebien codebien commented Aug 24, 2023

What?

It expands the resolution for Histograms, setting 0.001 as the minimum resolution.

Why?

It allows us to support decimal numbers up to 3 digits, increasing the detail of the values that we can store.

@codebien codebien self-assigned this Aug 24, 2023
@CLAassistant
Copy link

CLAassistant commented Aug 24, 2023

CLA assistant check
All committers have signed the CLA.

@codecov-commenter
Copy link

codecov-commenter commented Aug 24, 2023

Codecov Report

Merging #3302 (180fa7c) into master (d5b28fb) will decrease coverage by 0.02%.
Report is 2 commits behind head on master.
The diff coverage is 84.21%.

❗ Current head 180fa7c differs from pull request most recent head a7dec90. Consider uploading reports for the commit a7dec90 to get more accurate results

@@            Coverage Diff             @@
##           master    #3302      +/-   ##
==========================================
- Coverage   73.21%   73.19%   -0.02%     
==========================================
  Files         258      258              
  Lines       19886    19895       +9     
==========================================
+ Hits        14559    14563       +4     
- Misses       4404     4408       +4     
- Partials      923      924       +1     
Flag Coverage Δ
ubuntu 73.13% <84.21%> (-0.03%) ⬇️
windows 73.07% <84.21%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
js/modules/k6/grpc/client.go 84.35% <62.50%> (-0.34%) ⬇️
output/cloud/expv2/hdr.go 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

@codebien codebien force-pushed the cloudv2-hdr-higher-res branch 2 times, most recently from 0a58f7c to 0a0c400 Compare August 24, 2023 14:51
@codebien codebien added this to the v0.47.0 milestone Aug 24, 2023
@codebien codebien requested a review from mstoykov August 24, 2023 14:55
@codebien codebien marked this pull request as ready for review August 24, 2023 14:56
@codebien codebien requested review from a team and olegbespalov and removed request for a team August 24, 2023 15:10
mstoykov
mstoykov previously approved these changes Aug 25, 2023
Copy link
Contributor

@mstoykov mstoykov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general 👍

I left some not blocking comments.

But I would prefer if we get some tests with values between 0 and 1 which is what this was about. Maybe even with some values that the browser team think are representable.

output/cloud/expv2/hdr.go Outdated Show resolved Hide resolved
@@ -151,6 +164,9 @@ func histogramAsProto(h *histogram, time int64) *pbcloud.TrendHdrValue {
if h.ExtraHighBucket > 0 {
hval.ExtraHighValuesCounter = &h.ExtraHighBucket
}
// We don't expect to change the minimum resolution at runtime
// so a pointer is safe here
hval.MinResolution = &h.MinimumResolution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there are reason to use a pointer though?

It won't save memory and will add dereferencing the point when we have to actually write it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the problem is that this is because in the protobuf definition this is not required 🤦

Not certain we can do anythign about this 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the reason is how Protobuf defines optional values. We haven't released yet the new cloud output v2 as the default so we could think of migrating directly to a new Protobuf version making it required. But my feeling is that it would require a maintenance window.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that if we un-optional this field, this doesn't really change anything for the backend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@esquonk if the suggestion is to change it only on the client then it would require having different proto file versions between the backend and client, that doesn't sound optimal to me.
Instead, if you mean for both of them, then In that case, the problem is how to handle the versioning. If we set the field on the backend as required then the already k6 v0.46.0 released version will stop working if we force the K6_CLOUD_API_VERSION=2 overwriting it from the backend. We could consider it as a no-problem as we are pretty confident that we will set it directly from the code releasing v0.47.0, but I'm not sure. Thoughts?

olegbespalov
olegbespalov previously approved these changes Aug 25, 2023
Copy link
Contributor

@olegbespalov olegbespalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but joining to Mihail on his comments 👍

@codebien codebien dismissed stale reviews from olegbespalov and mstoykov via 6143178 August 25, 2023 08:30
olegbespalov
olegbespalov previously approved these changes Aug 25, 2023
mstoykov
mstoykov previously approved these changes Aug 25, 2023
Copy link
Contributor

@mstoykov mstoykov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Do we have any statistics on how much this change makes changes over 1 less accurate?

I guess it will be fairly good to have some kind of graph showing how accurate the samples are between the values 0 and ... something Less say whatever 1 minute is in ms so 1 *60 * 100 = 6000?

@mstoykov
Copy link
Contributor

That graph is mostly interesting for the release notes, but it will still be good to have it ealier IMO.

@codebien
Copy link
Contributor Author

We should not lose any precision because what we are doing is just using a factor for moving the values across the histogram. The relative error rate is generated from the number of buckets that we have in the range, and it is a constant value. In our case the error rate is ~1%.

Formula Precision =  100% / (number of sub-buckets in a major bucket) = 100 / 2^7 = 0.78

So the boundaries of each value are just upscaled for storage and downscaled for query but they will be the same.

Resolution Seconds (v/min_res) In Millis (v*1e3) Lower limit [ millis - (millis * 1%) ]+ Upper limit [ millis + (millis * 1%) ]
1.0 60 60k 59.4k 60.6k
0.001 60k 60M 59.4M 60.6M

@mstoykov does it make sense for you?

@mstoykov
Copy link
Contributor

mstoykov commented Aug 29, 2023

@codebien it makes some sense ... but we still change the number of buckets.

From my experiments it is barely noticeable in the values we have.

func TestSequence(t *testing.T) {
	resolution := float64(0.001)
	first := float64(50_000)
	last := resolveBucketIndex(float64(first) / resolution)
	first_last := last
	max := float64(60_000)
	increment := float64(1)
	for i := first; i <= max; i += increment {
		index := resolveBucketIndex(float64(i) / resolution)
		if last != index {
			// fmt.Println(last, index, i)
			last = index
		}
	}
	fmt.Printf("buckets total in range %f to %f with resolution %f are %d\n", first, max, resolution, last-first_last)
}

For example will result in

buckets total in range 50000.000000 to 60000.000000 with resolution 0.001000 are 38

vs

buckets total in range 50000.000000 to 60000.000000 with resolution 1.000000 are 39

So 1 bucket more - so very small reduction of accuracy in the 50 to 60 seconds. If I go in bigger ranges the difference is once again 1-2 either way, so it seems okay.

Obviously if you go from 0 to 1 the difference is huge :

buckets total in range 0.000000 to 1.000000 with resolution 0.001000 are 506

vs

buckets total in range 0.000000 to 1.000000 with resolution 1.000000 are 1

I think we can tl;dr it to "there is no lost of resolution in the values above 1" while there is significant improvement in the ones between 0 and 1

edit: p.s. the code was added to hdr_test.go as the easiest way to use the bucket index generation fucntion without exporitng it and using it in a main package.

output/cloud/expv2/hdr.go Outdated Show resolved Hide resolved
@codebien
Copy link
Contributor Author

codebien commented Aug 30, 2023

Hey @olegbespalov @mstoykov I've pushed a new fixup for fixing #3302 (comment). It was a mistake because we are checking the max value after that the resolution is applied, so the value will be checked already as the final value.

Can you take another look, please?

Copy link
Contributor

@olegbespalov olegbespalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM 👍

But I prefer to do a clean-up of the tests (mentioned in the comments) in the current PR to avoid future confusion.

However, I don't know how urgent is merging this, so approving it.

h := newHistogram()
// TODO: refactor
// An hack for preserving as the default for the tests the old value 1.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, it seems right now is the best moment to refactor this place to avoid future confusion.

for _, v := range tc.vals {
h.Add(v)
}
tc.exp.MinimumResolution = 1.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's better to keep this along with the other expectations

@@ -161,57 +180,84 @@ func TestHistogramAddWithMultipleOccurances(t *testing.T) {
Sum: 466.20000000000005,
Count: 5,
}
exp.MinimumResolution = 1.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not define it in the struct?

Count: 3,
MinimumResolution: 1.0,
}
h.MinimumResolution = 1.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it already done on the line 210? (straight after h := newHistogram()) 🤔

@codebien
Copy link
Contributor Author

codebien commented Aug 31, 2023

Hey @olegbespalov, we decided to make the min_resolution required instead of optional, so I need to rework those tests anyway. I will address your valuable suggestions directly in the new PR (that I will push today, maximum tomorrow). I would prefer to not block this PR so I can unlock the browser for testing the full picture.

I will reference the comments here directly in the new PR.

@olegbespalov
Copy link
Contributor

@codebien Sure, go ahead and merge it 👍

@codebien codebien merged commit fd6551d into master Aug 31, 2023
20 checks passed
@codebien codebien deleted the cloudv2-hdr-higher-res branch August 31, 2023 10:28
oleiade pushed a commit that referenced this pull request Sep 8, 2023
We increased the resolution for the histogram using a different min_resolution value. It increases the range of supported values. Now, it supports decimal values between 0 and 1. min_resolution is a factor that scales the ingested value, in this way, with the set value of 0.001, it supports 3 decimal digits where before there was only the option for aggregating integer values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants